{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Bayesian Statistics Made Simple\n",
    "===\n",
    "\n",
    "Code and exercises from my workshop on Bayesian statistics in Python.\n",
    "\n",
    "Copyright 2016 Allen Downey\n",
    "\n",
    "MIT License: https://opensource.org/licenses/MIT"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# If we're running on Colab, install empiricaldist\n",
    "# https://pypi.org/project/empiricaldist/\n",
    "\n",
    "import sys\n",
    "IN_COLAB = 'google.colab' in sys.modules\n",
    "\n",
    "if IN_COLAB:\n",
    "    !pip install empiricaldist"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "import seaborn as sns\n",
    "sns.set_style('white')\n",
    "sns.set_context('talk')\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "from empiricaldist import Pmf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The dice problem\n",
    "\n",
    "Create a suite of hypotheses that represents dice with different numbers of sides."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "dice = Pmf.from_seq([4, 6, 8, 12])\n",
    "dice"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise 1:** We'll solve this problem two ways.  First we'll do it \"by hand\", as we did with the cookie problem; that is, we'll multiply each hypothesis by the likelihood of the data, and then renormalize.\n",
    "\n",
    "In the space below, update `dice` based on the likelihood of the data (rolling a 6), then normalize and display the results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution goes here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise 2:**  Now let's do the same calculation using `Pmf.update`, which encodes the structure of a Bayesian update.\n",
    "\n",
    "Define a function called `likelihood_dice` that takes `data` and `hypo` and returns the probability of the data (the outcome of rolling the die) for a given hypothesis (number of sides on the die).\n",
    "\n",
    "Hint: What should you do if the outcome exceeds the hypothetical number of sides on the die?\n",
    "\n",
    "Here's an outline to get you started."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def likelihood_dice(data, hypo):\n",
    "    \"\"\"Likelihood function for the dice problem.\n",
    "    \n",
    "    data: outcome of the die roll\n",
    "    hypo: number of sides\n",
    "    \n",
    "    returns: float probability\n",
    "    \"\"\"\n",
    "    # TODO: fill this in!\n",
    "    return 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution goes here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can create a `Pmf` object and update it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "dice = Pmf.from_seq([4, 6, 8, 12])\n",
    "dice.update(likelihood_dice, 6)\n",
    "dice"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we get more data, we can perform more updates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "for roll in [8, 7, 7, 5, 4]:\n",
    "    dice.update(likelihood_dice, roll)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are the results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "dice"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The German tank problem\n",
    "\n",
    "The German tank problem is actually identical to the dice problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "def likelihood_tank(data, hypo):\n",
    "    \"\"\"Likelihood function for the tank problem.\n",
    "    \n",
    "    data: observed serial number\n",
    "    hypo: number of tanks\n",
    "    \n",
    "    returns: float probability\n",
    "    \"\"\"\n",
    "    if data > hypo:\n",
    "        return 0\n",
    "    else:\n",
    "        return 1 / hypo"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is the update after seeing Tank #42."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "tank = Pmf.from_seq(range(100))\n",
    "tank.update(likelihood_tank, 42)\n",
    "tank.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And here's what the posterior distribution looks like."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "def decorate_tank(title):\n",
    "    \"\"\"Labels the axes.\n",
    "    \n",
    "    title: string\n",
    "    \"\"\"\n",
    "    plt.xlabel('Number of tanks')\n",
    "    plt.ylabel('PMF')\n",
    "    plt.title(title)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "tank.plot()\n",
    "decorate_tank('Distribution after one tank')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise 3:**  Suppose we see another tank with serial number 17.  What effect does this have on the posterior probabilities?\n",
    "\n",
    "Update the `Pmf` with the new data and plot the results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution goes here"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
